feat(rust): Replace raw string production to pyo3 API #27271
Draft
tylerriccio33 wants to merge 12 commits into pola-rs:main
# Conflicts: # crates/polars-mem-engine/src/planner/lp.rs # py-polars/src/polars/io/pyarrow_dataset/anonymous_scan.py # py-polars/tests/unit/io/test_pyarrow_dataset.py
Draft/WIP for now!
This closes #26591 and supersedes the code from #24255.
Follow up to #27091
This PR replaces the code that builds raw Python strings, which are pushed down and evaluated using eval. I can't use this area of the codebase at work since it uses eval. Instead, the PR converts the Polars expressions to Arrow predicates, and then converts those predicates from their Rust representation to Python objects via pyo3. These objects are used when scanning Arrow datasets at the Python level.
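To illustrate the idea (this is not the code in the PR), here is a minimal stdlib-only Python sketch of building a predicate by calling constructors and operators directly rather than generating a source string and evaluating it. The node types Col, Lit and BinOp and the build function are hypothetical names made up for this example; the real predicate representation lives in Rust and is not shown here.

```python
from dataclasses import dataclass
import operator
from typing import Any, Callable, Union


# Hypothetical predicate tree, standing in for the Rust-side representation.
@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: Any


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Col, Lit, BinOp]

# Whitelisted operators; anything else raises KeyError instead of being executed.
_OPS: dict[str, Callable[[Any, Any], Any]] = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "and": operator.and_,
    "or": operator.or_,
}


def build(node: Node, field: Callable[[str], Any], scalar: Callable[[Any], Any]) -> Any:
    """Walk the tree and construct objects directly; no source string, no eval."""
    if isinstance(node, Col):
        return field(node.name)
    if isinstance(node, Lit):
        return scalar(node.value)
    if isinstance(node, BinOp):
        left = build(node.left, field, scalar)
        right = build(node.right, field, scalar)
        return _OPS[node.op](left, right)
    raise TypeError(f"unsupported node: {node!r}")


# Stand-in backend: evaluate the predicate against a single row of plain values.
row = {"a": 5, "b": "x"}
tree = BinOp("and", BinOp("gt", Col("a"), Lit(3)), BinOp("eq", Col("b"), Lit("x")))
result = build(tree, field=row.__getitem__, scalar=lambda v: v)
```

With pyarrow installed, passing pyarrow.compute.field and pyarrow.compute.scalar as the two factories should yield a pyarrow expression instead, since those expressions overload the same comparison and bitwise operators. The point of the design is that literal values are passed as objects and never spliced into code that gets executed.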
In a future PR I'd like to remove the string conversion entirely; it's currently used as a caching key. I originally mocked up a replacement for it, but I felt that was out of scope.
In a future PR I'd like to do this for the Iceberg scans too (I'll really need this at work), which can leverage the Arrow conversion layer.
I used AI for:
I'd appreciate feedback/critique if there are better ways to accomplish things, especially in Rust!